    Length, Protein-Protein Interactions, and Complexity

    The evolutionary reason for the increase in gene length from archaea to prokaryotes to eukaryotes observed in large-scale genome sequencing efforts has been unclear. We propose here that the increasing complexity of protein-protein interactions has driven the selection of longer proteins, as longer proteins are better able to distinguish among a larger number of distinct interactions due to their greater average surface area. Annotated protein sequences available from the SWISS-PROT database were analyzed for thirteen eukaryotic, eight bacterial, and two archaeal species. The number of subcellular locations with which each protein is associated is used as a measure of the number of interactions in which a protein participates. Two databases of yeast protein-protein interactions were used as another measure of the number of interactions in which each S. cerevisiae protein participates. Protein length is shown to correlate with both the number of subcellular locations with which a protein is associated and the number of interactions as measured by yeast two-hybrid experiments. Protein length is also shown to correlate with the probability that the protein is encoded by an essential gene. Interestingly, average protein length and number of subcellular locations are not significantly different between all human proteins and protein targets of known, marketed drugs. Increased protein length appears to be a significant mechanism by which the increasing complexity of protein-protein interaction networks is accommodated within the natural evolution of species. Consideration of protein length may be a valuable tool in drug design, one that predicts different strategies for inhibiting interactions in aberrant and normal pathways. Comment: 13 pages, 5 figures, 2 tables, to appear in Physica

    Drilling operations for the South Pole Ice Core (SPICEcore) project

    Over the course of the 2014/15 and 2015/16 austral summer seasons, the South Pole Ice Core project recovered a 1751 m deep ice core at the South Pole. This core provided a high-resolution record of paleoclimate conditions in East Antarctica during the Holocene and late Pleistocene. The drilling and core processing were completed using the new US Intermediate Depth Drill system, which was designed and built by the US Ice Drilling Program at the University of Wisconsin–Madison. In this paper, we present and discuss the setup, operation, and performance of the drill system.

    Big data and other challenges in the quest for orthologs

    Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. Availability and implementation: All such materials are available at http://questfororthologs.org.
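
    The at-least-quadratic scaling mentioned in this abstract typically arises when every proteome is compared against every other one: n proteomes give n(n-1)/2 unordered pairs. A minimal Python sketch of that arithmetic follows; the function name `pairwise_comparisons` is hypothetical and not taken from any orthology tool.

```python
def pairwise_comparisons(n_proteomes: int) -> int:
    """Count unordered proteome pairs in an all-against-all comparison."""
    if n_proteomes < 0:
        raise ValueError("n_proteomes must be non-negative")
    return n_proteomes * (n_proteomes - 1) // 2


# Illustrative proteome counts only; not figures from the abstract.
for n in (10, 100, 1000):
    print(n, pairwise_comparisons(n))
```

    Under this assumption, going from 100 to 1000 proteomes multiplies the number of pairs by roughly 100 rather than 10.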

    Misregulation of cell cycle-dependent methylation of budding yeast CENP-A contributes to chromosomal instability.

    Centromere (CEN) identity is specified epigenetically by specialized nucleosomes containing the evolutionarily conserved CEN-specific histone H3 variant CENP-A (Cse4 in Saccharomyces cerevisiae, CENP-A in humans), which is essential for faithful chromosome segregation. However, the epigenetic mechanisms that regulate Cse4 function have not been fully defined. In this study, we show that cell cycle-dependent methylation of Cse4-R37 regulates kinetochore function and high-fidelity chromosome segregation. We generated a custom antibody that specifically recognizes methylated Cse4-R37 and showed that methylation of Cse4 is cell cycle regulated, with maximum levels of methylated Cse4-R37 and its enrichment at CEN chromatin occurring in mitotic cells. The methyl-mimic cse4-R37F mutant exhibits synthetic lethality with kinetochore mutants, reduced levels of CEN-associated kinetochore proteins and chromosome instability (CIN), suggesting that mimicking the methylation of Cse4-R37 throughout the cell cycle is detrimental to faithful chromosome segregation. Our results show that the SPOUT methyltransferase Upa1 contributes to methylation of Cse4-R37 and that overexpression of UPA1 leads to a CIN phenotype. In summary, our studies have defined a role for cell cycle-regulated methylation of Cse4 in high-fidelity chromosome segregation and highlight an important role of epigenetic modifications, such as methylation of kinetochore proteins, in preventing CIN, an important hallmark of human cancers.

    Nonlinear Dynamics 2013

    The book of proceedings includes extended abstracts of presentations at the Fourth International Conference on Nonlinear Dynamics.

    COVID-19, A Global Health Concern Requiring Science-Based Solutions

    Science-based, concrete action points are needed for governments, organizations, and individuals to reduce the spread, lessen the impact, ease the concerns of the wider population, and avoid further outbreaks.

    Towards Alignment Independent Quantitative Assessment of Homology Detection

    Identification of homologous proteins provides a basis for protein annotation. Sequence alignment tools reliably identify homologs sharing high sequence similarity. However, identification of homologs that share low sequence similarity remains a challenge. Lowering the cutoff value could enable the identification of diverged homologs, but also introduces numerous false hits. Methods are being continuously developed to minimize this problem. Estimating the fraction of homologs in a set of protein alignments can help in the assessment and development of such methods, and gives users an intuitive quantitative assessment of protein alignment results. Herein, we present a computational approach that estimates the number of homologs in a set of protein pairs. The method requires a prevalent and detectable protein feature that is conserved between homologs. By analyzing the feature prevalence in a set of pairwise protein alignments, the method can estimate the number of homolog pairs in the set independently of the alignments' quality. Using the HomoloGene database as a standard of truth, we implemented this approach in a proteome-wide analysis. The results revealed that this approach, which is independent of the alignments themselves, works well for estimating the number of homologous proteins across a wide range of homology values. In summary, the presented method can accompany homology searches and method development, validate search results, and allow tuning of tools and methods.
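
    One way to picture the feature-prevalence idea in this abstract is a two-component mixture: if a conserved feature is shared by homolog pairs at one rate and by unrelated pairs at a lower background rate, the observed sharing rate in a set of pairs constrains the homolog fraction. The Python sketch below illustrates that reasoning only. The mixture formula, the function name `estimate_homolog_fraction`, and all numbers are assumptions for illustration, not the authors' published method or data.

```python
def estimate_homolog_fraction(p_observed: float,
                              p_homolog: float,
                              p_background: float) -> float:
    """Solve p_observed = f * p_homolog + (1 - f) * p_background for f.

    Assumed two-component mixture (hypothetical, not from the abstract):
    p_homolog    -- rate at which homolog pairs share the feature
    p_background -- rate at which non-homolog pairs share it by chance
    """
    if p_homolog == p_background:
        raise ValueError("the feature must be uninformative-free: rates must differ")
    f = (p_observed - p_background) / (p_homolog - p_background)
    # Clamp to [0, 1] because sampling noise can push the estimate outside.
    return min(1.0, max(0.0, f))


# Hypothetical inputs: 10,000 aligned pairs, half of which share the feature.
n_pairs = 10_000
fraction = estimate_homolog_fraction(p_observed=0.5, p_homolog=0.9, p_background=0.1)
print(round(fraction * n_pairs))  # estimated number of homolog pairs
```

    Note that this estimate uses only how often the feature is shared, not the alignment scores, which mirrors the alignment-independent framing of the abstract.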

    Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization

    Recent years have witnessed much progress in computational modelling for protein subcellular localization. However, the existing sequence-based predictive models demonstrate moderate or unsatisfactory performance, and the gene ontology (GO) based models risk overestimating performance for novel proteins. Furthermore, many human proteins have multiple subcellular locations, which makes the computational modelling more complicated. To date, far too few studies have focused on predicting the subcellular localization of human proteins that may reside in multiple cellular compartments. In this paper, we propose a multi-label multi-kernel transfer learning model for human protein subcellular localization (MLMK-TLM). MLMK-TLM introduces a multi-label confusion matrix, formally defines three multi-labelling performance measures, and adapts one-against-all multi-class probabilistic outputs to the multi-label learning scenario; on this basis, it extends our published work GO-TLM (gene ontology based transfer learning model for protein subcellular localization) and MK-TLM (multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization) to multiplex human protein subcellular localization. With proper homolog knowledge transfer, a comprehensive survey of model performance for novel proteins, and multi-labelling capability, MLMK-TLM should be more practically applicable. Experiments on a human protein benchmark dataset show that MLMK-TLM significantly outperforms the baseline model and demonstrates good multi-labelling ability for novel human proteins. Some findings (predictions) are validated by the latest Swiss-Prot database. The software can be freely downloaded at http://soft.synu.edu.cn/upload/msy.rar